Negative association between multiple sclerosis immunogenetic profile and in silico immunogenicities of 12 viruses

Human Leukocyte Antigen (HLA) is involved in both multiple sclerosis (MS) and immune response to viruses. Here we investigated the virus-HLA immunogenicity (V-HLA) of 12 viruses implicated in MS with respect to 17 HLA Class I alleles positively associated to MS prevalence in 14 European countries. Overall, higher V-HLA immunogenicity was associated with smaller MS-HLA effect, with human herpes virus 3 (HHV3), JC human polyoma virus (JCV), HHV1, HHV4, HHV7, HHV5 showing the strongest association, followed by HHV8, HHV6A, and HHV6B (moderate association), and human endogenous retrovirus (HERV-W), HHV2, and human papilloma virus (HPV) (weakest association). These findings suggest that viruses with proteins of high HLA immunogenicity are eliminated more effectively and, consequently, less likely to be involved in MS.


Lisa M. James 1,2,3 & Apostolos P. Georgopoulos 1,2,3,4*
Multiple sclerosis (MS) is a chronic autoimmune inflammatory disease that affects the central nervous system and is characterized by multifocal demyelinating lesions, axonal loss, and atrophy 1. MS is the most common neurological disorder among young adults, and its global prevalence is increasing for unclear reasons 2,3. The etiology of MS is uncertain, although viruses have long been purported to contribute to the disease, particularly in genetically vulnerable individuals. For example, human herpes viruses (HHV) including Epstein-Barr virus (EBV/HHV4), roseolavirus (HHV6), and varicella zoster virus (VZV/HHV3), as well as human endogenous retroviruses (HERVs), have been commonly implicated in MS [4][5][6][7]. In addition, JC virus (JCV), a human polyomavirus, is associated with MS and particularly with complications stemming from immunosuppressive treatment for MS [8][9][10]. The primary genetic influence on MS is attributed to human leukocyte antigen (HLA) genes, which are centrally involved in the human immune response to viruses and other foreign antigens and have been implicated in both MS risk and protection [11][12][13][14][15][16].
In a recent immunogenetic epidemiological study, we evaluated the association between the population frequencies of 127 HLA alleles and the population prevalence of MS across 14 European countries and found a preponderance of negative (i.e., protective) associations between HLA allele frequencies and MS prevalence, particularly for Class I HLA alleles 16. Given the role of HLA in elimination/suppression of viruses and other foreign antigens, we hypothesized that negative (i.e., protective) associations between Class I HLA and MS are likely attributable to superior pathogen elimination afforded by those alleles, and that, conversely, positive (i.e., susceptibility) HLA-MS associations may be attributable to insufficient immunogenetic protection against certain pathogens, thereby hindering their suppression and possibly contributing to downstream effects associated with MS. Here, in an effort to test this hypothesis and bridge separate lines of research implicating exposure to pathogens and HLA in MS, we evaluated the virus-HLA (V-HLA) immunogenicity of viruses implicated in MS with respect to HLA alleles that are positively associated with MS prevalence.

MS-HLA susceptibility scores
The MS-HLA susceptibility scores are epidemiological measures of association between MS prevalence and HLA allele frequency. Of the 69 HLA Class I alleles investigated, 24 were positive, indicating a positive association

Immunogenicity of viral proteins for HLA Class I alleles
In silico virus-HLA immunogenicity scores (V-HLA scores) are T-cell epitope predictions that estimate the likelihood that the complex between a given epitope and a specific HLA Class I allele will engage the T-cell receptor and, hence, activate CD8+ cytotoxic lymphocytes to kill the infected cell. V-HLA immunogenicity varied appreciably among the 12 viruses studied (Table 2, Fig. 2), being highest for HHV4 (V-HLA = 13.639) and lowest for HHV6A (V-HLA = 2.563), a 5.32× differential. Across alleles, V-HLA was highest for C*03:03 (V-HLA = 12.686) and lowest for A*03:01 (V-HLA = 3.486) (Table 3, Fig. 3).

Association between MS-HLA susceptibility and V-HLA immunogenicity
Overall, MS-HLA susceptibility scores and V-HLA immunogenicity scores were negatively associated, such that MS-HLA susceptibility scores decreased as V-HLA immunogenicity increased (Fig. 4; r = −0.512, P = 0.035, N = 17), indicating a protective effect of viral immunogenicity. To evaluate the association of MS-HLA scores with the V-HLA immunogenicity of individual viruses in a robust, uniform, and nonparametric way, correlations were computed between data converted to normal scores using Blom's formula 17.
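The conversion to normal scores can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: it assumes the usual form of Blom's formula, in which a value of rank r among n values is replaced by the standard-normal quantile of (r − 3/8)/(n + 1/4), and it assumes that tied values share their average rank. The function name `blom_normal_scores` is hypothetical.

```python
from statistics import NormalDist


def blom_normal_scores(values):
    """Convert values to normal scores with Blom's formula.

    Each value is replaced by the standard-normal quantile of
    (rank - 3/8) / (n + 1/4). Tied values share their average rank
    (an assumption made for this sketch).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]
```

Applying the function to both variables before computing a Pearson correlation gives a rank-based measure of association that is less sensitive to outliers and skewed distributions than a correlation on the raw scores.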

Figure 4. The MS-HLA susceptibility scores of the 17 alleles (Table 1) are plotted against the mean of the corresponding (per allele) V-HLA immunogenicity scores (N = 12 viruses). See text for details.

Figure 7. Negative association of MS-HLA susceptibility scores of the 17 alleles (Table 1) vs. corresponding V-HLA immunogenicity scores for the viruses indicated (HHV8, HHV6A, HHV6B). See Table 4 for detailed statistics.

Figures 5-8 plot MS-HLA susceptibility scores against V-HLA immunogenicity scores for each of the 12 viruses investigated. All associations were negative, such that MS-HLA susceptibility decreased as V-HLA immunogenicity increased, indicating a protective effect of the latter. The strength of this association differed across viruses (Fig. 9), as reflected in the order of the figures, with Fig. 5 illustrating the case with the strongest association (HHV3), Fig. 8 the case with the weakest association (HPV), and the rest (Figs. 6, 7) in between. Detailed association statistics are given in Table 4, where the strength of the association between MS-HLA susceptibility and V-HLA immunogenicity is formalized as the percent of variance (PVE) in MS-HLA susceptibility scores explained by the corresponding (per allele) V-HLA immunogenicity. HHV3 had the highest PVE (43.56%) and HPV the lowest (5.11%), an 8.52× differential (Table 4, Fig. 9).

Discussion
It is largely accepted that MS is a result of complex genetic and environmental interactions. Here we focused on the role of viruses and HLA in MS. Specifically, we evaluated the immunogenicity of 12 viruses with respect to 17 HLA Class I alleles that we found to be associated with susceptibility to MS by analyzing population-level epidemiological data. Our findings documented a negative association between the V-HLA immunogenicity of all 12 viruses and MS-HLA susceptibility across these 17 Class I susceptibility alleles. Although the strength of this association varied across viruses, the systematic negative association between V-HLA immunogenicity and MS-HLA susceptibility highlights a key role of HLA-mediated virus elimination and/or suppression in influencing MS risk, both at the initial infection and at later relapses caused by reactivation of a latent virus.
MS is presumed to result from exposure to ubiquitous infectious agents in the context of permissive genetic traits 18. In addition to Class II HLA alleles that have long been implicated in MS 11, the present findings suggest that the interaction between several common viruses, including human herpes viruses and JCV, with Class I HLA influences MS prevalence. In light of the role of HLA in antigen elimination and virus suppression, the effect of exposure to certain viruses on MS appears to be moderated by a given HLA allele's ability to bind and eliminate viral antigens that may otherwise contribute to MS or other conditions. Indeed, HHVs have been implicated in a number of human diseases including MS [4][5][6][7]. Following initial infection, typically in childhood, HHVs establish latency and may be periodically reactivated by various triggers and/or waning immunity. Notably, patterns of reactivation have been shown to correspond to MS relapse 19,20. Similarly, JCV persists in a latent state in the brain, is detectable in human brain tissue, and has also been linked to MS 7,21,22. The mechanisms underlying the influence of HLA on virus-MS associations are unclear, although several mechanisms, including molecular mimicry, persistent viral antigens, bystander activation, superantigen activation, adjuvant effects, epitope spreading, and viral support of autoreactive cell survival, have been proposed to explain how viruses might induce autoimmunity in MS 17,[23][24][25][26]. We have suggested that exposure to pathogens in the absence of HLA that can bind and eliminate those antigens results in antigen persistence and deleterious long-term effects including low-grade chronic inflammation and downstream autoimmunity, apoptosis, and atrophy, thereby setting the groundwork for various conditions including MS 16,27.
With regard to specific viruses, the strongest effects observed here were for HHV3/VZV, JCV, HHV1/HSV1, HHV4/EBV, HHV7, and HHV5/CMV. Each of these viruses has been previously linked with MS, although the findings have been somewhat inconsistent, even for EBV, which is considered the leading viral candidate for MS 7,18,21,25,[28][29][30][31][32][33][34][35][36][37][38]. For instance, recent evidence demonstrated that although EBV antibodies were higher in MS patients than in controls, neither EBV antibodies nor salivary EBV DNA load were associated with radiological or clinical disease activity in patients with MS 39. Like many HHVs, EBV is also commonly detected in the healthy adult population 40, suggesting that infection with EBV or other HHVs is insufficient to cause MS in the absence of other factors, including HLA 41,42. Furthermore, even among HLA alleles that were positively associated with MS risk in the present study, there was considerable variability in HLA-virus immunogenicities, MS-HLA susceptibility scores, and their associations.

Additional contributions
In addition to the contributions of the Class I HLA-virus immunogenicities to MS susceptibility documented here, there are likely other contributing factors. Class II HLA has been strongly linked to MS risk 10; thus, it is likely that HLA Class II alleles, which are involved in formation of antibodies and immunological memory and often form haplotypes with other HLA alleles including those of Class I, contribute to MS and particularly to autoimmunity associated with MS 26. Beyond viruses, several other environmental and lifestyle factors also appear to play a role in MS susceptibility, including geography, smoking, sun exposure/vitamin D, and adolescent obesity [43][44][45]. Notably, some of these factors have been shown to interact with HLA to influence MS risk 43. For example, smoking has been shown to increase the odds of MS in individuals lacking the protective HLA-A*02:01 allele or in carriers of the high-risk Class II HLA-DRB1*15:01 allele 46. Similar interactions have been documented for obesity 47. Thus, other HLA × environmental/lifestyle factor interactions not evaluated here may account for some of the unexplained variance in the HLA-MS profile.

Limitations
Our findings provide novel insights highlighting the interaction of viral exposure and host immunogenetics in MS; however, there are several study limitations that must be considered. First, the analyses here are based on MS diagnosis without regard to subtype; as such, it is unclear to what extent the present findings apply to different forms of the disease. Second, the data utilized here were derived from populations of Continental Western European countries and may not extend to other geographic locations given the global variation in HLA 48,49, MS prevalence 2, and virus-MS associations 50,51. Third, it would be informative to evaluate the immunogenicity of these viruses with regard to Class II HLA, particularly in light of the extensive literature documenting the relevance of Class II HLA in MS; however, we are not aware of any in silico application that allows for examination of both binding affinity and immunogenicity for Class II alleles akin to the approach we used here for Class I. Finally, we focused exclusively on the role of viruses in MS and, for each virus, on specific proteins among the several possible. The interplay between various environmental factors that have been linked to MS [43][44][45] and the HLA-related MS-viral associations remains to be investigated.

Prevalence of MS
The population prevalence of MS was computed for each of 14 countries in Continental Western Europe (Table 5). For each country, we identified the total number of people with MS in 2019 from the Global Health Data Exchange 52, a publicly available catalog of data from the Global Burden of Disease study, divided that value by the total population of the country in 2019 52, and expressed the prevalence as a percentage.
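The prevalence computation described above amounts to a single ratio, sketched here in Python. The function name and the input values in the usage note are hypothetical and are not taken from the study's data.

```python
def prevalence_percent(cases, population):
    """Return prevalence as a percentage of the total population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * cases / population
```

For example, a hypothetical country with 1,500 cases in a population of 1,000,000 would have `prevalence_percent(1500, 1_000_000)` equal to 0.15 (percent).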

HLA alleles
We obtained the population frequency in 2019 of 69 common HLA Class I alleles from 14 Continental Western European countries (Austria, Belgium, Denmark, Finland, France, Germany, Greece, Italy, Netherlands, Portugal, Norway, Spain, Sweden, and Switzerland) 53. The alleles and their mean frequencies (across countries) are given in Table 6.

MS-HLA susceptibility scores
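We computed the covariance between the prevalence of MS and the population frequency of each of the 69 HLA Class I alleles of Table 6:

$$\text{MS-HLA susceptibility score} = \frac{1}{N-1}\sum_{i=1}^{N}\left(f_i - \bar{f}\right)\left(p_i - \bar{p}\right) \qquad (1)$$

where $f_i$ and $p_i$ denote allele frequency and MS prevalence for the $i$th country, respectively, $\bar{f}$ and $\bar{p}$ are their means, and $N = 14$ is the number of countries. A positive covariance indicates a positive association between MS prevalence and allele frequency, indicating MS susceptibility.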
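A minimal Python sketch of this covariance score for one allele is given below. It assumes the sample covariance with an N − 1 denominator, which is the common statistical-software convention; the function name and the inputs used in the usage note are hypothetical.

```python
def sample_covariance(freqs, prevalences):
    """Sample covariance between allele frequency and MS prevalence.

    freqs[i] and prevalences[i] belong to the same country.
    Uses the N - 1 denominator (an assumption of this sketch).
    """
    n = len(freqs)
    if n != len(prevalences):
        raise ValueError("inputs must have the same length")
    if n < 2:
        raise ValueError("at least two countries are required")
    mean_f = sum(freqs) / n
    mean_p = sum(prevalences) / n
    cross = sum((f - mean_f) * (p - mean_p)
                for f, p in zip(freqs, prevalences))
    return cross / (n - 1)
```

With toy inputs, `sample_covariance([1, 2, 3], [2, 4, 6])` is positive (frequency and prevalence rise together, the pattern interpreted as susceptibility), whereas `sample_covariance([1, 2, 3], [6, 4, 2])` is negative (the pattern interpreted as protection).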

Viral antigens
For a given allele, we estimated the immunogenicity of typical proteins of 12 viruses that have been implicated in MS to varying degrees, namely 9 human herpes virus species (HHV1-HHV8, with HHV6A and HHV6B counted separately), human polyoma JC virus (JCV), human endogenous retrovirus (HERV-W), and human papilloma virus (HPV). To our knowledge, HPV has not been implicated in MS and serves as a negative control. Details of the proteins analyzed are given in Table 7, and their amino acid (AA) sequences are given in the Appendix, together with a short description of their function.

Determination of immunogenicity of HLA Class I alleles
The INeo-Epp method 54 was used for T-cell receptor (TCR) epitope prediction via the INeo-Epp web form interface 55. For that purpose, we split a given viral antigen (Table 7) into all possible 9-mer (nonamer) AA residue epitopes using a sliding window approach [56][57][58] (Fig. 10) and submitted each epitope to the web application together with a specific HLA allele. More specifically, we paired all epitopes with all alleles and obtained for each pair its percentile rank, a measure of binding affinity of the epitope-HLA allele complex; smaller percentile ranks indicate higher binding affinity. The web application gave as an outcome a TCR predictive score for pairs with high binding affinities (percentile rank < 2); scores > 0.4 indicated positive immunogenicity and were analyzed further. We computed the following as a comprehensive measure of immunogenicity for quantitative analyses. Let $K$ be the number of nonamers that showed positive immunogenicity (score > 0.4); then $K$, weighted by their average score $\bar{w}$, serves as an estimate of the overall effectiveness, $I$, of a given allele to induce immunogenicity for a given protein:

$$I = K\,\bar{w} \qquad (2)$$
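The sliding-window split and the summary score can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: it does not call the INeo-Epp web tool, and it assumes the per-nonamer predictions are already available as hypothetical `(percentile_rank, tcr_score)` pairs. The function names, the short example sequence, and the example predictions are all made up for illustration.

```python
def nonamers(sequence, k=9):
    """Return all consecutive k-mers of a protein sequence (sliding window)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def immunogenicity(predictions, rank_cutoff=2.0, score_cutoff=0.4):
    """Summary immunogenicity of one protein for one allele.

    predictions: iterable of (percentile_rank, tcr_score) pairs, one per
    nonamer, assumed to come from an external predictor.
    Counts the nonamers with strong binding (rank < rank_cutoff) and
    positive immunogenicity (score > score_cutoff), then weights that
    count K by their average score.
    """
    positive = [score for rank, score in predictions
                if rank < rank_cutoff and score > score_cutoff]
    k_positive = len(positive)
    if k_positive == 0:
        return 0.0
    mean_score = sum(positive) / k_positive
    return k_positive * mean_score
```

A protein of length L yields L − 8 nonamers, so the hypothetical 11-residue sequence `"MKTAYIAKQRQ"` yields three. Because the count is multiplied by the mean score, the result equals the sum of the scores of the positively immunogenic nonamers.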

Association of V-HLA immunogenicities with MS-HLA susceptibility scores
We evaluated the association between MS-HLA susceptibility scores, Eq. (1), and V-HLA immunogenicity scores, Eq. (2), by computing the Pearson correlation between them across the 17 HLA alleles, separately for each virus. The correlation coefficient $r$ obtained for each virus was squared and multiplied by 100 to provide the percent of MS-HLA susceptibility variance explained (PVE) by the viral protein immunogenicity:

$$\mathrm{PVE} = 100 \times r^{2}$$
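As a rough illustration of this step, the following Python sketch computes a Pearson correlation across alleles and converts it to the percent of variance explained. The function names and toy inputs are assumptions for illustration; the study itself used IBM-SPSS.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def percent_variance_explained(r):
    """Square the correlation and express it as a percentage."""
    return 100.0 * r * r
```

Because the correlation is squared, the PVE discards the sign of the association. A correlation of magnitude 0.66, for instance, corresponds to a PVE of 43.56%, the value reported for HHV3, and the overall correlation of −0.512 corresponds to roughly 26%.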

Implementation of analysis procedures
The IBM-SPSS statistical package (version 27) was used for implementing standard statistical analyses, including descriptive statistics and measures of associations.Since we were testing explicitly only a negative association between virus immunogenicity and MS-HLA covariance, one-sided P-values were used.We did not correct for multiple comparisons because these were planned comparisons.
Figure 10. The sliding nonamer window approach used to determine exhaustively in silico the immunogenicity of all possible consecutive nonamers in a protein, illustrated here for HHV3.
Figure 1. MS-HLA susceptibility scores are plotted against their rank. Red, scores of alleles used in further analyses; gray, scores at the tail of the distribution, not used hereafter. The red line demarcates these two groups. See text for details.

Figure 9. Percent of MS-HLA susceptibility variance explained by V-HLA immunogenicity of the 12 viruses investigated.

Table 1. MS-HLA PScov scores for the 17 susceptibility Class I alleles investigated.

Table 2. Descriptive statistics of V-HLA immunogenicities across the 17 HLA Class I alleles in Table 1 (N = 17). SEM, standard error of the mean.

Table 4. Association statistics between MS-HLA susceptibility scores and immunogenicities of the 12 viruses investigated (N = 17 HLA Class I alleles, Table 1). r, Pearson correlation; SE, standard error of r; CI, confidence interval; PVE, percent of the MS-HLA susceptibility score variance explained by virus immunogenicity. See text for details.

Table 5. Prevalence of multiple sclerosis in 14 Continental Western European (CWE) countries in 2019.

Table 6. The 69 HLA Class I alleles used and their mean frequencies.